Overview

Dataset statistics

Number of variables: 14
Number of observations: 67762
Missing cells: 8
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 MiB
Average record size in memory: 112.0 B
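
The two memory figures above are consistent with each other if every cell occupies 8 bytes. The sketch below checks that arithmetic. The 8-byte width is an assumption (typical for 64-bit numbers and object pointers), since the report does not state how the sizes were measured.

```python
# Row and column counts are taken from the dataset statistics above.
N_ROWS = 67762
N_COLS = 14
# Assumption: each cell is stored in 8 bytes; the report does not say so.
BYTES_PER_VALUE = 8

record_size = N_COLS * BYTES_PER_VALUE       # bytes per row
total_mib = N_ROWS * record_size / 2**20     # bytes converted to MiB

print(f"record size: {record_size} B")
print(f"total size: {total_mib:.1f} MiB")
```

Under that assumption the result matches the reported 112.0 B per record and 7.2 MiB in total.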

Variable types

Numeric: 11
Categorical: 3

Alerts

date has a high cardinality: 105 distinct values [High cardinality]
time has a high cardinality: 40788 distinct values [High cardinality]
price is highly correlated with geo_lon and 3 other fields [High correlation]
geo_lon is highly correlated with price and 1 other fields [High correlation]
region is highly correlated with price and 1 other fields [High correlation]
level is highly correlated with levels [High correlation]
levels is highly correlated with level [High correlation]
rooms is highly correlated with area [High correlation]
area is highly correlated with price and 2 other fields [High correlation]
kitchen_area is highly correlated with price and 1 other fields [High correlation]
geo_lon is highly correlated with region [High correlation]
region is highly correlated with geo_lon [High correlation]
level is highly correlated with levels [High correlation]
levels is highly correlated with level [High correlation]
rooms is highly correlated with area [High correlation]
area is highly correlated with rooms and 1 other fields [High correlation]
kitchen_area is highly correlated with area [High correlation]
geo_lon is highly correlated with region [High correlation]
region is highly correlated with geo_lon [High correlation]
rooms is highly correlated with area [High correlation]
area is highly correlated with rooms [High correlation]
Unnamed: 0 is highly correlated with geo_lat and 2 other fields [High correlation]
geo_lat is highly correlated with Unnamed: 0 and 3 other fields [High correlation]
geo_lon is highly correlated with geo_lat and 1 other fields [High correlation]
region is highly correlated with Unnamed: 0 and 3 other fields [High correlation]
building_type is highly correlated with Unnamed: 0 and 3 other fields [High correlation]
level is highly correlated with levels [High correlation]
levels is highly correlated with building_type and 1 other fields [High correlation]
price is highly skewed (γ1 = 66.27105033) [Skewed]
Unnamed: 0 is uniformly distributed [Uniform]
time is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
building_type has 10025 (14.8%) zeros [Zeros]
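
The skewness alert reports γ1, a measure of how asymmetric a distribution is. As a minimal sketch, the function below computes the uncorrected moment ratio m3 / m2^1.5. The report does not say which estimator it uses, so values from this sketch may differ slightly from the reported γ1. The two sample lists are hypothetical and are not taken from the dataset.

```python
def skewness(values):
    """Uncorrected sample skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mu) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy samples, chosen only to show the sign of the statistic.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 100]   # one very large value, like an outlier price

print(skewness(symmetric))      # zero for a symmetric sample
print(skewness(right_tailed))   # positive: the long tail is on the right
```

A few extreme values on the high side are enough to push the statistic well above zero, which is the pattern the price alert points at.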

Reproduction

Analysis started: 2022-02-18 15:54:19.880164
Analysis finished: 2022-02-18 15:55:05.018653
Duration: 45.14 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 67762
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42357.06037
Minimum: 0
Maximum: 84702
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4233.05
Q1: 21096.25
median: 42396.5
Q3: 63561.75
95-th percentile: 80474.95
Maximum: 84702
Range: 84702
Interquartile range (IQR): 42465.5

Descriptive statistics

Standard deviation: 24476.83556
Coefficient of variation (CV): 0.5778690813
Kurtosis: -1.203344286
Mean: 42357.06037
Median Absolute Deviation (MAD): 21227.5
Skewness: -0.0014222962
Sum: 2870199125
Variance: 599115479.3
Monotonicity: Not monotonic
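
Several of the figures above are derived from the others. The sketch below recomputes the range, IQR, coefficient of variation and variance from the Unnamed: 0 values listed above. It uses the standard textbook definitions and is a consistency check, not a reproduction of the profiler's own code.

```python
import math

# Values copied from the Unnamed: 0 statistics above.
q1, q3 = 21096.25, 63561.75
minimum, maximum = 0, 84702
mean, std = 42357.06037, 24476.83556

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr)
print(round(cv, 6), round(variance, 1))
```

Because the inputs are rounded as printed in the report, the recomputed CV and variance agree with the reported values only up to rounding.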
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
14040                 1      < 0.1%
75449                 1      < 0.1%
22770                 1      < 0.1%
29558                 1      < 0.1%
64279                 1      < 0.1%
77172                 1      < 0.1%
51240                 1      < 0.1%
19880                 1      < 0.1%
64086                 1      < 0.1%
52885                 1      < 0.1%
Other values (67752)  67752  > 99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
11     1      < 0.1%

Value  Count  Frequency (%)
84702  1      < 0.1%
84701  1      < 0.1%
84700  1      < 0.1%
84699  1      < 0.1%
84698  1      < 0.1%
84697  1      < 0.1%
84695  1      < 0.1%
84694  1      < 0.1%
84692  1      < 0.1%
84691  1      < 0.1%

price
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 13314
Distinct (%): 19.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4545642.519
Minimum: 800
Maximum: 1680000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 800
5-th percentile: 1070000
Q1: 1800000
median: 2815000
Q3: 4910849
95-th percentile: 11207901.2
Maximum: 1680000000
Range: 1679999200
Interquartile range (IQR): 3110849

Descriptive statistics

Standard deviation: 17313994.15
Coefficient of variation (CV): 3.808921197
Kurtosis: 5186.635777
Mean: 4545642.519
Median Absolute Deviation (MAD): 1265000
Skewness: 66.27105033
Sum: 3.080218284 × 10^11
Variance: 2.997743933 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
2500000               785    1.2%
1700000               784    1.2%
2200000               733    1.1%
1750000               724    1.1%
1850000               714    1.1%
1500000               710    1.0%
1800000               706    1.0%
2100000               705    1.0%
1650000               695    1.0%
2600000               681    1.0%
Other values (13304)  60525  89.3%

Value   Count  Frequency (%)
800     1      < 0.1%
1000    1      < 0.1%
1300    1      < 0.1%
1650    1      < 0.1%
1750    1      < 0.1%
80000   1      < 0.1%
85000   1      < 0.1%
100000  3      < 0.1%
100100  1      < 0.1%
111111  1      < 0.1%

Value       Count  Frequency (%)
1680000000  1      < 0.1%
1451892000  4      < 0.1%
1003425000  6      < 0.1%
600000000   1      < 0.1%
375000000   1      < 0.1%
360044740   1      < 0.1%
330000000   1      < 0.1%
250000000   1      < 0.1%
230205652   1      < 0.1%
229453913   1      < 0.1%

date
Categorical

HIGH CARDINALITY

Distinct: 105
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 529.5 KiB

Value               Count
2018-09-18          13913
2018-09-10          6893
2018-09-17          6772
2018-09-13          6588
2018-09-14          6306
Other values (100)  27290

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 677620
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 0.1%

Sample

1st row: 2018-09-10
2nd row: 2018-09-11
3rd row: 2018-09-18
4th row: 2018-09-12
5th row: 2018-09-18

Common Values

Value              Count  Frequency (%)
2018-09-18         13913  20.5%
2018-09-10         6893   10.2%
2018-09-17         6772   10.0%
2018-09-13         6588   9.7%
2018-09-14         6306   9.3%
2018-09-11         5984   8.8%
2018-09-12         5182   7.6%
2018-09-15         4501   6.6%
2018-09-09         3899   5.8%
2018-09-16         3896   5.7%
Other values (95)  3828   5.6%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      150045  22.1%
-      135524  20.0%
1      133849  19.8%
8      85400   12.6%
2      73014   10.8%
9      71507   10.6%
7      6815    1.0%
3      6652    1.0%
4      6348    0.9%
5      4531    0.7%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    542096  80.0%
Dash Punctuation  135524  20.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      150045  27.7%
1      133849  24.7%
8      85400   15.8%
2      73014   13.5%
9      71507   13.2%
7      6815    1.3%
3      6652    1.2%
4      6348    1.2%
5      4531    0.8%
6      3935    0.7%

Dash Punctuation
Value  Count   Frequency (%)
-      135524  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  677620  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      150045  22.1%
-      135524  20.0%
1      133849  19.8%
8      85400   12.6%
2      73014   10.8%
9      71507   10.6%
7      6815    1.0%
3      6652    1.0%
4      6348    0.9%
5      4531    0.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  677620  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      150045  22.1%
-      135524  20.0%
1      133849  19.8%
8      85400   12.6%
2      73014   10.8%
9      71507   10.6%
7      6815    1.0%
3      6652    1.0%
4      6348    0.9%
5      4531    0.7%

time
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 40788
Distinct (%): 60.2%
Missing: 0
Missing (%): 0.0%
Memory size: 529.5 KiB

Value                 Count
15:05:51              15
15:05:47              13
12:16:42              13
02:38:22              11
00:26:42              11
Other values (40783)  67699

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 542096
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24155
Unique (%): 35.6%

Sample

1st row: 12:40:14
2nd row: 17:26:15
3rd row: 02:35:04
4th row: 21:40:17
5th row: 06:18:38

Common Values

Value                 Count  Frequency (%)
15:05:51              15     < 0.1%
15:05:47              13     < 0.1%
12:16:42              13     < 0.1%
02:38:22              11     < 0.1%
00:26:42              11     < 0.1%
12:16:44              11     < 0.1%
02:38:48              10     < 0.1%
06:34:41              10     < 0.1%
15:09:49              10     < 0.1%
00:27:06              10     < 0.1%
Other values (40778)  67648  99.8%

Length

Histogram of lengths of the category
Value                 Count  Frequency (%)
15:05:51              15     < 0.1%
12:16:42              13     < 0.1%
15:05:47              13     < 0.1%
02:38:22              11     < 0.1%
00:26:42              11     < 0.1%
12:16:44              11     < 0.1%
00:27:00              10     < 0.1%
18:39:30              10     < 0.1%
14:10:04              10     < 0.1%
00:26:58              10     < 0.1%
Other values (40778)  67648  99.8%

Most occurring characters

Value  Count   Frequency (%)
:      135524  25.0%
0      79929   14.7%
1      70337   13.0%
2      51119   9.4%
5      43503   8.0%
3      43121   8.0%
4      42778   7.9%
6      21379   3.9%
8      18269   3.4%
9      18190   3.4%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     406572  75.0%
Other Punctuation  135524  25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      79929  19.7%
1      70337  17.3%
2      51119  12.6%
5      43503  10.7%
3      43121  10.6%
4      42778  10.5%
6      21379  5.3%
8      18269  4.5%
9      18190  4.5%
7      17947  4.4%

Other Punctuation
Value  Count   Frequency (%)
:      135524  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  542096  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
:      135524  25.0%
0      79929   14.7%
1      70337   13.0%
2      51119   9.4%
5      43503   8.0%
3      43121   8.0%
4      42778   7.9%
6      21379   3.9%
8      18269   3.4%
9      18190   3.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  542096  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
:      135524  25.0%
0      79929   14.7%
1      70337   13.0%
2      51119   9.4%
5      43503   8.0%
3      43121   8.0%
4      42778   7.9%
6      21379   3.9%
8      18269   3.4%
9      18190   3.4%

geo_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30984
Distinct (%): 45.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.33473326
Minimum: 41.459089
Maximum: 69.404746
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 41.459089
5-th percentile: 44.7338406
Q1: 53.82923665
median: 55.4367215
Q3: 57.10704175
95-th percentile: 59.99494287
Maximum: 69.404746
Range: 27.945657
Interquartile range (IQR): 3.2778051

Descriptive statistics

Standard deviation: 5.094782394
Coefficient of variation (CV): 0.09376658517
Kurtosis: -0.2426889171
Mean: 54.33473326
Median Absolute Deviation (MAD): 1.63131915
Skewness: -0.8819143949
Sum: 3681830.195
Variance: 25.95680764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
55.0303931            1529   2.3%
59.904302             925    1.4%
59.91383              917    1.4%
59.882717             799    1.2%
59.994211             577    0.9%
60.012588             474    0.7%
59.885205             447    0.7%
59.84979              433    0.6%
55.013916             382    0.6%
54.9471407            374    0.6%
Other values (30974)  60905  89.9%

Value       Count  Frequency (%)
41.459089   2      < 0.1%
41.875094   1      < 0.1%
42.0373088  1      < 0.1%
42.0385187  1      < 0.1%
42.0466759  1      < 0.1%
42.0466962  1      < 0.1%
42.056685   1      < 0.1%
42.058966   2      < 0.1%
42.0591726  1      < 0.1%
42.064618   1      < 0.1%

Value       Count  Frequency (%)
69.404746   1      < 0.1%
69.4020481  1      < 0.1%
69.36236    1      < 0.1%
69.3602242  1      < 0.1%
69.355203   1      < 0.1%
69.3489978  1      < 0.1%
69.3467308  1      < 0.1%
69.345469   1      < 0.1%
69.341612   1      < 0.1%
67.6050471  1      < 0.1%

geo_lon
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 30999
Distinct (%): 45.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.30529789
Minimum: 27.6543701
Maximum: 129.8489149
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 27.6543701
5-th percentile: 30.274878
Q1: 37.50769892
median: 39.71583
Q3: 60.74191022
95-th percentile: 83.0232456
Maximum: 129.8489149
Range: 102.1945448
Interquartile range (IQR): 23.2342113

Descriptive statistics

Standard deviation: 20.30577929
Coefficient of variation (CV): 0.4036509103
Kurtosis: -0.148124825
Mean: 50.30529789
Median Absolute Deviation (MAD): 9.33973365
Skewness: 1.012094497
Sum: 3408787.596
Variance: 412.3246725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
83.0155452            1529   2.3%
30.30993              917    1.4%
30.451298             799    1.2%
30.47213              718    1.1%
30.274878             577    0.9%
30.378822             474    0.7%
30.370414             447    0.7%
30.218527             433    0.6%
83.00329              384    0.6%
82.9585961            374    0.6%
Other values (30989)  61110  90.2%

Value       Count  Frequency (%)
27.6543701  1      < 0.1%
27.966288   1      < 0.1%
28.0112085  1      < 0.1%
28.088172   1      < 0.1%
28.1963821  1      < 0.1%
28.2444966  1      < 0.1%
28.2519947  1      < 0.1%
28.2532375  1      < 0.1%
28.263577   1      < 0.1%
28.263954   1      < 0.1%

Value        Count  Frequency (%)
129.8489149  1      < 0.1%
129.841386   1      < 0.1%
129.83985    1      < 0.1%
129.8375295  1      < 0.1%
129.787272   1      < 0.1%
129.7791793  1      < 0.1%
129.759397   1      < 0.1%
129.75934    2      < 0.1%
129.757996   1      < 0.1%
129.752714   1      < 0.1%

region
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 47
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4035.560293
Minimum: 3
Maximum: 13919
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 3
5-th percentile: 3
Q1: 2661
median: 2843
Q3: 5368
95-th percentile: 9654
Maximum: 13919
Range: 13916
Interquartile range (IQR): 2707

Descriptive statistics

Standard deviation: 3187.438598
Coefficient of variation (CV): 0.7898379324
Kurtosis: -0.4224183679
Mean: 4035.560293
Median Absolute Deviation (MAD): 387
Skewness: 0.8430140455
Sum: 273453601
Variance: 10159764.81
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value              Count  Frequency (%)
2661               12532  18.5%
9654               12105  17.9%
2843               9218   13.6%
81                 6501   9.6%
3                  3872   5.7%
3230               2816   4.2%
2922               2447   3.6%
6171               2109   3.1%
2722               1653   2.4%
5282               1577   2.3%
Other values (37)  12931  19.1%

Value  Count  Frequency (%)
3      3872   5.7%
81     6501   9.6%
1010   900    1.3%
2072   309    0.5%
2359   262    0.4%
2594   25     < 0.1%
2604   1557   2.3%
2661   12532  18.5%
2722   1653   2.4%
2843   9218   13.6%

Value  Count  Frequency (%)
13919  129    0.2%
13913  9      < 0.1%
11991  69     0.1%
11416  68     0.1%
11171  139    0.2%
10160  313    0.5%
9654   12105  17.9%
9648   314    0.5%
9579   94     0.1%
8509   31     < 0.1%

building_type
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.756142914
Minimum: 0
Maximum: 5
Zeros: 10025
Zeros (%): 14.8%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.15426488
Coefficient of variation (CV): 0.6572727485
Kurtosis: -0.9717106741
Mean: 1.756142914
Median Absolute Deviation (MAD): 1
Skewness: 0.1081529668
Sum: 118998
Variance: 1.332327413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value      Count  Frequency (%)
1          22229  32.8%
3          20653  30.5%
2          12490  18.4%
0          10025  14.8%
4          1990   2.9%
5          374    0.6%
(Missing)  1      < 0.1%

Value  Count  Frequency (%)
0      10025  14.8%
1      22229  32.8%
2      12490  18.4%
3      20653  30.5%
4      1990   2.9%
5      374    0.6%

Value  Count  Frequency (%)
5      374    0.6%
4      1990   2.9%
3      20653  30.5%
2      12490  18.4%
1      22229  32.8%
0      10025  14.8%

level
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 34
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.270450554
Minimum: 1
Maximum: 34
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5
Q3: 9
95-th percentile: 16
Maximum: 34
Range: 33
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.903953124
Coefficient of variation (CV): 0.7820734861
Kurtosis: 1.710576404
Mean: 6.270450554
Median Absolute Deviation (MAD): 3
Skewness: 1.333415385
Sum: 424892
Variance: 24.04875624
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value              Count  Frequency (%)
2                  8914   13.2%
1                  7814   11.5%
3                  7779   11.5%
4                  6865   10.1%
5                  6661   9.8%
6                  4101   6.1%
7                  3949   5.8%
8                  3871   5.7%
9                  3808   5.6%
10                 2955   4.4%
Other values (24)  11044  16.3%

Value  Count  Frequency (%)
1      7814   11.5%
2      8914   13.2%
3      7779   11.5%
4      6865   10.1%
5      6661   9.8%
6      4101   6.1%
7      3949   5.8%
8      3871   5.7%
9      3808   5.6%
10     2955   4.4%

Value  Count  Frequency (%)
34     4      < 0.1%
33     2      < 0.1%
32     4      < 0.1%
31     6      < 0.1%
30     13     < 0.1%
29     4      < 0.1%
28     8      < 0.1%
27     19     < 0.1%
26     37     0.1%
25     138    0.2%

levels
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 39
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.51632945
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 10
Q3: 16
95-th percentile: 24
Maximum: 39
Range: 38
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.410053558
Coefficient of variation (CV): 0.5566056082
Kurtosis: -0.2606603837
Mean: 11.51632945
Median Absolute Deviation (MAD): 5
Skewness: 0.6969757658
Sum: 780358
Variance: 41.08878662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Value              Count  Frequency (%)
5                  12092  17.8%
10                 11772  17.4%
9                  7985   11.8%
17                 3692   5.4%
16                 3660   5.4%
18                 2433   3.6%
25                 2272   3.4%
14                 2223   3.3%
12                 2106   3.1%
4                  2030   3.0%
Other values (29)  17496  25.8%

Value  Count  Frequency (%)
1      264    0.4%
2      1351   2.0%
3      1826   2.7%
4      2030   3.0%
5      12092  17.8%
6      1488   2.2%
7      928    1.4%
8      963    1.4%
9      7985   11.8%
10     11772  17.4%

Value  Count  Frequency (%)
39     2      < 0.1%
38     2      < 0.1%
37     9      < 0.1%
36     5      < 0.1%
35     9      < 0.1%
34     8      < 0.1%
33     134    0.2%
32     26     < 0.1%
31     27     < 0.1%
30     26     < 0.1%

rooms
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.801272118
Minimum: -2
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 798
Negative (%): 1.2%
Memory size: 529.5 KiB

Quantile statistics

Minimum: -2
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 11
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9262339033
Coefficient of variation (CV): 0.5142109812
Kurtosis: 0.9952084724
Mean: 1.801272118
Median Absolute Deviation (MAD): 1
Skewness: 0.4860337747
Sum: 122056
Variance: 0.8579092436
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value  Count  Frequency (%)
1      29197  43.1%
2      22205  32.8%
3      13450  19.8%
4      1733   2.6%
-1     797    1.2%
5      334    0.5%
6      28     < 0.1%
8      6      < 0.1%
9      5      < 0.1%
7      5      < 0.1%

Value  Count  Frequency (%)
-2     1      < 0.1%
-1     797    1.2%
1      29197  43.1%
2      22205  32.8%
3      13450  19.8%
4      1733   2.6%
5      334    0.5%
6      28     < 0.1%
7      5      < 0.1%
8      6      < 0.1%

Value  Count  Frequency (%)
9      5      < 0.1%
8      6      < 0.1%
7      5      < 0.1%
6      28     < 0.1%
5      334    0.5%
4      1733   2.6%
3      13450  19.8%
2      22205  32.8%
1      29197  43.1%
-1     797    1.2%

area
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3655
Distinct (%): 5.4%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.02651363
Minimum: 5
Maximum: 942
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 5
5-th percentile: 30
Q1: 39
median: 49
Q3: 64
95-th percentile: 97
Maximum: 942
Range: 937
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 26.35769425
Coefficient of variation (CV): 0.4789998949
Kurtosis: 88.17493391
Mean: 55.02651363
Median Absolute Deviation (MAD): 12
Skewness: 5.314982078
Sum: 3728651.59
Variance: 694.728046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
44                   1218   1.8%
40                   1209   1.8%
43                   1187   1.8%
45                   1163   1.7%
42                   1101   1.6%
60                   1076   1.6%
33                   968    1.4%
30                   943    1.4%
38                   928    1.4%
32                   922    1.4%
Other values (3645)  57046  84.2%

Value  Count  Frequency (%)
5      1      < 0.1%
5.6    3      < 0.1%
10     2      < 0.1%
10.5   1      < 0.1%
11     1      < 0.1%
11.5   1      < 0.1%
12     14     < 0.1%
13     12     < 0.1%
13.5   1      < 0.1%
13.6   2      < 0.1%

Value  Count  Frequency (%)
942    1      < 0.1%
825    1      < 0.1%
753    3      < 0.1%
700    1      < 0.1%
600    1      < 0.1%
590.7  1      < 0.1%
530    1      < 0.1%
526    1      < 0.1%
505    1      < 0.1%
487    1      < 0.1%

kitchen_area
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1724
Distinct (%): 2.5%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.06171544
Minimum: 0.07
Maximum: 187.17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 529.5 KiB

Quantile statistics

Minimum: 0.07
5-th percentile: 5
Q1: 7
median: 10
Q3: 13
95-th percentile: 20.66
Maximum: 187.17
Range: 187.1
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.9410838
Coefficient of variation (CV): 0.5370852136
Kurtosis: 29.33962918
Mean: 11.06171544
Median Absolute Deviation (MAD): 3
Skewness: 3.270106848
Sum: 749552.9
Variance: 35.29647672
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
6                    6192   9.1%
9                    5117   7.6%
10                   4581   6.8%
8                    3898   5.8%
12                   3483   5.1%
7                    3325   4.9%
5                    3292   4.9%
11                   2760   4.1%
14                   1895   2.8%
13                   1472   2.2%
Other values (1714)  31746  46.8%

Value  Count  Frequency (%)
0.07   1      < 0.1%
0.1    8      < 0.1%
0.11   1      < 0.1%
0.12   1      < 0.1%
0.13   6      < 0.1%
0.2    1      < 0.1%
1      212    0.3%
2      237    0.3%
2.1    1      < 0.1%
2.2    5      < 0.1%

Value   Count  Frequency (%)
187.17  1      < 0.1%
100     1      < 0.1%
95      2      < 0.1%
88      1      < 0.1%
87      1      < 0.1%
85.5    1      < 0.1%
85      1      < 0.1%
79      1      < 0.1%
78      9      < 0.1%
76.8    1      < 0.1%

object_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 529.5 KiB

Value  Count
1.0    51624
11.0   16137

Length

Max length: 4
Median length: 3
Mean length: 3.238145836
Min length: 3

Characters and Unicode

Total characters: 219420
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
1.0        51624  76.2%
11.0       16137  23.8%
(Missing)  1      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1.0    51624  76.2%
11.0   16137  23.8%

Most occurring characters

Value  Count  Frequency (%)
1      83898  38.2%
.      67761  30.9%
0      67761  30.9%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     151659  69.1%
Other Punctuation  67761   30.9%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      83898  55.3%
0      67761  44.7%

Other Punctuation
Value  Count  Frequency (%)
.      67761  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  219420  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      83898  38.2%
.      67761  30.9%
0      67761  30.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  219420  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      83898  38.2%
.      67761  30.9%
0      67761  30.9%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
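
The rank-then-correlate recipe described above can be sketched with the standard library alone. This is an illustration, not the profiler's implementation: the helper names are made up, ties receive the average of their positions (a common convention that the report does not spell out), and the toy `area` and `price` lists are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)                # the 1/n factors cancel out


def spearman(xs, ys):
    """Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: price grows monotonically but not linearly with area.
area = [30, 40, 50, 60, 70]
price = [1.0, 1.5, 4.0, 9.0, 30.0]
rho = spearman(area, price)
print(rho)
```

Because only the ordering matters, the strongly curved relationship in the toy data still yields a ρ of essentially 1.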

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
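
A minimal sketch of that calculation follows, together with a check of the invariance property mentioned above. The data are hypothetical, and the shift and scale factors (both positive) are arbitrary choices for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)                # the 1/n factors cancel out


# Hypothetical, nearly linear toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson(x, y)

# Change location and scale of each variable separately (positive scale factors).
x_rescaled = [10 * v + 3 for v in x]
y_rescaled = [0.5 * v - 7 for v in y]
r_rescaled = pearson(x_rescaled, y_rescaled)

print(r, r_rescaled)
```

Both calls return the same value up to floating-point error, which is the invariance described in the text.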

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
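
The pair-counting definition above translates directly into a short loop. The sketch implements exactly that formula, often called tau-a. Statistical libraries commonly apply a tie correction instead, so their output can differ when the data contain ties. The sample lists are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1       # both variables move the same way
        elif direction < 0:
            discordant += 1       # the variables move in opposite ways
        # direction == 0 is a tie and counts as neither
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data without ties: 7 concordant and 3 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau(x, y)
print(tau)
```

With 10 pairs in total, the toy data give τ = (7 - 3) / 10.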

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
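
The per-column nullity counts behind these views can be cross-checked against the overview. The sketch below sums the "Missing" figures reported in each variable section of this report and compares the total with the "Missing cells" entry. The dictionary is transcribed by hand from those sections, not produced by the profiler.

```python
# "Missing" counts as listed in each variable section of this report.
missing_per_column = {
    "Unnamed: 0": 0, "price": 0, "date": 0, "time": 0,
    "geo_lat": 0, "geo_lon": 0, "region": 1, "building_type": 1,
    "level": 1, "levels": 1, "rooms": 1, "area": 1,
    "kitchen_area": 1, "object_type": 1,
}
N_ROWS = 67762   # number of observations from the overview

total_cells = N_ROWS * len(missing_per_column)
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)            # total missing cells across all columns
print(f"{missing_pct:.5f}%")    # share of all cells that are missing
```

The total agrees with the 8 missing cells in the overview, and the share is far below the 0.1% display threshold.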

Sample

First rows

       Unnamed: 0  price    date        time      geo_lat    geo_lon     region   building_type  level  levels  rooms  area   kitchen_area  object_type
0      14040       3900000  2018-09-10  12:40:14  55.786480  49.223459   2922.0   1.0            10.0   11.0    3.0    67.00  8.80          1.0
1      24608       4250000  2018-09-11  17:26:15  55.905045  37.393578   81.0     1.0            25.0   25.0    1.0    39.00  10.50         1.0
2      76636       4340360  2018-09-18  02:35:04  59.882717  30.451298   2661.0   0.0            4.0    27.0    1.0    57.11  11.38         1.0
3      31944       8000000  2018-09-12  21:40:17  55.640462  37.359415   3.0      1.0            1.0    17.0    3.0    74.50  10.00         1.0
4      82427       2750000  2018-09-18  06:18:38  55.042053  82.940926   9654.0   1.0            1.0    5.0     2.0    44.60  6.00          1.0
5      1543        2200000  2018-09-08  05:48:47  52.071278  113.395982  10160.0  1.0            1.0    5.0     3.0    69.00  10.00         1.0
6      16791       3300000  2018-09-10  17:11:38  59.945788  30.270863   2661.0   3.0            2.0    5.0     -1.0   27.00  2.00          1.0
7      62129       2300000  2018-09-17  10:32:03  47.230470  39.617520   3230.0   1.0            3.0    10.0    1.0    40.00  10.00         1.0
8      80828       1500000  2018-09-18  04:57:54  63.561706  53.672052   4417.0   3.0            4.0    4.0     1.0    31.00  6.00          1.0
9      52067       2599000  2018-09-15  14:54:03  56.314938  38.139990   81.0     3.0            1.0    5.0     2.0    45.00  5.00          1.0

Last rows

       Unnamed: 0  price    date        time      geo_lat    geo_lon     region   building_type  level  levels  rooms  area   kitchen_area  object_type
67752  872         1450000  2018-09-08  01:51:23  54.946948  82.970115   9654.0   1.0            7.0    10.0    1.0    48.94  14.00         11.0
67753  33022       1600000  2018-09-13  03:40:49  55.013248  83.000609   9654.0   1.0            1.0    10.0    1.0    48.94  13.78         11.0
67754  1921        3600000  2018-09-08  09:09:27  55.744787  37.999236   81.0     3.0            18.0   18.0    1.0    39.00  8.70          1.0
67755  66657       2800000  2018-09-17  20:42:21  45.040160  38.975965   2843.0   1.0            2.0    9.0     3.0    64.00  9.00          1.0
67756  3589        2250000  2018-09-08  15:22:17  56.797201  60.512647   6171.0   1.0            3.0    15.0    1.0    37.00  10.00         11.0
67757  53806       7900000  2018-09-15  23:27:54  59.980207  30.391762   2661.0   3.0            12.0   13.0    2.0    61.00  15.00         1.0
67758  39082       2950000  2018-09-13  16:53:47  55.500098  36.030344   81.0     3.0            4.0    5.0     2.0    44.00  6.00          1.0
67759  60575       1241000  2018-09-17  06:56:11  54.946046  82.961274   9654.0   1.0            15.0   18.0    1.0    41.39  11.00         11.0
67760  7787        1250000  2018-09-09  13:30:40  61.711546  30.700702   8090.0   3.0            1.0    5.0     1.0    33.00  7.60          1.0
67761  50467       6508800  2018-09-15  08:44:50  55.672380  37.855156   81.0     2.0            6.0    33.0    2.0    67.80  21.00         11.0